Fix information leakage of confidential features in Search API#6276
Merged
DanielRyanSmith merged 7 commits intomainfrom Apr 29, 2026
Merged
Fix information leakage of confidential features in Search API#6276DanielRyanSmith merged 7 commits intomainfrom
DanielRyanSmith merged 7 commits intomainfrom
Conversation
ed435f7 to
b8dc38f
Compare
Collaborator
Author
|
CodeQL seems to be upset with existing code |
Collaborator
Author
|
I'll fix the CodeQL issues in this PR as well if possible |
a33d293 to
3a5d152
Compare
3a5d152 to
810e001
Compare
This commit addresses two major issues introduced in the previous fix for the search API information leakage: 1. Information Leakage via Cache: Caching is enabled for `name_only=True` queries, but the cache key didn't partition based on the current user. Thus, if an authorized user cached a confidential result, an unauthorized user could retrieve it from the cache. The cache key has been updated to include the user's email to securely partition the results. 2. Unscalable Model Fetching: The previous implementation used `ndb.get_multi` to fetch all matching confidential `FeatureEntry` models into memory to perform permission filtering. This has been optimized to evaluate permissions via efficient integer ID set subtraction using existing participant cache keys and admin checks, completely avoiding the expensive `ndb.get_multi` call.
0e01135 to
d7525c1
Compare
Bypasses the Datastore query for confidential features if the user has global permission to view all confidential features. This saves resources by avoiding unnecessary database reads and processing for internal Google/Chromium users and site administrators.
Instead of partitioning the search cache by individual user emails, which causes severe cache fragmentation and degrades performance, the cache is now partitioned by access level (`global_access`, `no_access`, or the user's email if they participate in any features). This ensures that the majority of regular users and unauthenticated users share the same cache, improving hit rates and preserving Search API performance.
jrobbins
approved these changes
Apr 28, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes an information leakage vulnerability in the Search API where
total_countwas calculated before filtering out confidential features.Root Cause / Motivation
During a security review, it was discovered that the
total_countproperty of the Search API response was calculating the total number of features matching a query across the entire datastore before applying confidentiality filtering on the current page. This could be abused to leak the existence or total number of confidential features matching a specific query by an unprivileged user.To fix this,
internals/search.pynow leverages a parallel datastore query to identify all confidential features matching the query, and filters out the ones the user doesn't have explicit permission to view from theresult_id_setbefore thetotal_countis evaluated and the results are paginated.Optimizations
global_access,no_access, or the user's email if they participate in any features). This ensures the majority of users safely share the same cached results.Detailed Changelog
internals/search.py:confidential_future_opsto parallelize finding confidential features that matched the user's query. Those results are dynamically cross-checked with the user's view permissions, and explicitly stripped out ofresult_id_setprior to pagination and total count length check.confidential_future_opsand intersection filtering ifis_google_or_chromium_account(user)orcan_admin_site(user)is true.make_cache_keyto partition cache entries by access level (global_access,no_access, or user email) rather than partitioning strictly by user email, restoring search cache performance.internals/search_test.py:test_make_cache_keyassertions to use the expectedno_accesspartition string instead ofanonymous.